Machine learning based prediction of recurrence after curative resection for rectal cancer

Purpose Patients with rectal cancer without distant metastases are typically treated with radical surgery. After curative resection, several factors can affect tumor recurrence. This study aimed to analyze factors related to rectal cancer recurrence after curative resection using different machine learning techniques. Methods Consecutive patients who underwent curative surgery for rectal cancer between 2004 and 2018 at Gil Medical Center were included. Patients with stage IV disease, colon cancer, anal cancer, other recurrent cancer, emergency surgery, or hereditary malignancies were excluded from the study. The Synthetic Minority Oversampling Technique with Tomek link (SMOTETomek) was used to compensate for data imbalance between the recurrence and no-recurrence groups. Four machine learning methods, logistic regression (LR), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost), were used to identify significant factors. To reduce overfitting and improve model performance, feature importance was calculated using the permutation importance technique. Results The database contained 3320 patients. After exclusion, the total sample size of the study was 961 patients. The median follow-up period was 60.8 months (range: 1.2–192.4). The recurrence rate during follow-up was 13.2% (n = 127). After applying the SMOTETomek method, the recurrence and no-recurrence groups were equalized at 667 patients each. Of the 16 variables analyzed, the top eight ranked variables {pathologic Tumor stage (pT), sex, concurrent chemoradiotherapy, pathologic Node stage (pN), age, postoperative chemotherapy, pathologic Tumor-Node-Metastasis stage (pTNM), and perineural invasion} were selected in order of permutation importance. The highest area under the curve (AUC) was obtained with the SVM method (0.831), with a sensitivity, specificity, and accuracy of 0.692, 0.814, and 0.798, respectively. The lowest AUC was obtained with the XGBoost method (0.804), with a sensitivity, specificity, and accuracy of 0.308, 0.928, and 0.845, respectively. The variable with the highest importance was pT as assessed by SVM, RF, and XGBoost (0.06, 0.12, and 0.13, respectively), whereas pTNM had the highest importance when assessed by LR (0.05). Conclusions In the current study, SVM showed the best AUC, and the most influential factor across all machine learning methods except LR was pT. Patients with rectal cancer who have a high pT stage need closer surveillance during postoperative follow-up.


Introduction
Colorectal cancer is a common malignant disease with the third highest incidence and second highest mortality rate worldwide [1]. Rectal cancer accounts for approximately one-third of all colorectal cancers and has a relatively higher recurrence rate than colon cancer. This is because the lower rectum lacks a serosa, which protects against tumor invasion through the muscle layer, and because it is technically more demanding to obtain a sufficient safety margin [2]. The 5-year recurrence rate of locally advanced rectal cancer after curative surgery is reported to be in the range of 6-27.5% [3]. Such a high rate is associated with both tumor- and treatment-related factors. Early detection and immediate treatment of rectal cancer recurrence may prevent patients from entering a dismal stage. Therefore, clinicians need to identify the factors that increase the risk of rectal cancer recurrence and be more alert during the follow-up period after surgery.
In recent years, artificial intelligence has been in the spotlight in various fields, and its applications in medicine are progressing rapidly. Machine learning-based algorithms, which form the basis of artificial intelligence, have been developed over the past decades for predicting disease risk, prognosis, diagnosis, and even the course of treatment in healthcare settings [4]. Further, recent studies have reported the feasibility and utility of artificial intelligence-based prediction of recurrence in several malignant diseases, including colorectal, breast, and gastric cancer [5][6][7][8][9][10]. However, in colorectal cancer, only a few studies employing machine learning methodologies have focused exclusively on recurrence prediction for rectal cancer without including colon cancer. Hence, we aimed to compare four different machine learning algorithms in terms of performance and accuracy in predicting significant risk factors for the recurrence of rectal cancer after curative resection.

Patient selection and dataset
We used the colorectal cancer surgery database, which was retrospectively collected from the Clinical Research Data Warehouse (CRDW) at the Gil Medical Center. The data were accessed for research purposes from August 27, 2021. All data were anonymized so that individual participants could not be identified. The database included 3320 consecutive patients who underwent surgery for colorectal cancer between January 2004 and December 2018. From the database, we identified patients who underwent curative surgery for rectal cancer (R0 resection, meaning a tumor-free resection margin, or R1 resection, meaning a resection margin with microscopic but no macroscopic residual tumor). Patients with stage IV disease, colon cancer, anal cancer, recurrent cancer, emergency surgery, or hereditary malignancies were excluded from the study. After exclusion, 961 patients remained eligible for the study (

Ethics and consent
This study obtained institutional review board approval from the Ethics Review Committee of the Gil Medical Center (approval no. GAIRB2021-316). All procedures were performed in accordance with the ethical standards of Gil Medical Center at Gachon University, and the 1964 Declaration of Helsinki and its later amendments. Because of the retrospective nature of the study, the need to obtain informed consent was waived for the individual participants by the Ethics Review Committee.

Compensating for data imbalances
In this study, we employed the Synthetic Minority Oversampling Technique with Tomek link (SMOTETomek) to address the data imbalance between the recurrence and no-recurrence groups. SMOTETomek combines oversampling and undersampling, using SMOTE for oversampling and Tomek links for undersampling. SMOTE uses the k-nearest neighbor (KNN) algorithm to find the nearest minority-class neighbors of each minority sample and generates a new sample between that sample and one of its neighbors, with the position set by a random weight between 0 and 1. A Tomek link is a pair of neighboring samples of different classes; the sample belonging to the majority class is eliminated from each pair [11]. With the SMOTETomek technique, we obtained 1334 samples, with 667 in the recurrence group and 667 in the no-recurrence group, thereby addressing the data imbalance.
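The two steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names (`smote_sample`, `nearest_index`, `tomek_links`), the Euclidean distance, and the toy points are all hypothetical.

```python
import math
import random

def smote_sample(x, neighbor, rng):
    """Create one synthetic minority sample between x and a minority neighbor."""
    gap = rng.random()  # random weight in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]

def nearest_index(i, points):
    """Index of the point closest to points[i] (Euclidean distance, assumed)."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))

def tomek_links(points, labels):
    """Pairs (i, j) that are mutual nearest neighbors with different labels."""
    nn = [nearest_index(i, points) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]

# Toy data (hypothetical): label 1 = minority class, label 0 = majority class.
rng = random.Random(0)
synthetic = smote_sample([1.0, 1.0], [3.0, 5.0], rng)

points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2], [9.0, 9.0]]
labels = [0, 1, 0, 0, 1]
links = tomek_links(points, labels)  # only points 0 and 1 form a Tomek link
```

In the undersampling step described in the text, the majority-class member of each returned pair (here, point 0) would be the one removed.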

Potential predictors
The database included 43 clinical features, and surgeons initially selected 16 features that were considered clinically related to rectal cancer recurrence. The following features were analyzed by the machine learning techniques: patient baseline characteristics {age, sex, American Society of Anesthesiologists score (ASA), body mass index (BMI), and initial carcinoembryonic antigen (CEA)}, treatment-related factors {concurrent chemoradiotherapy (CCRT) and postoperative chemotherapy}, and tumor-related factors {location of rectal cancer, histologic type, pathologic Tumor stage (pT), pathologic Node stage (pN), pathologic Tumor-Node-Metastasis stage (pTNM), lymphovascular invasion (LVI), perineural invasion (PNI), involvement of distal resection margin, and harvested lymph nodes}. Tumor stage was defined according to the American Joint Committee on Cancer (AJCC) 8th edition [12]. All continuous variables were converted to categorical variables according to their clinical significance: age was divided into < 65 and ≥ 65 years; BMI was divided into < 25 and ≥ 25 kg/m²; initial CEA was divided into < 5 and ≥ 5 ng/mL; and the number of harvested lymph nodes was divided into < 12 and ≥ 12. None of the included variables had any missing values.

Machine learning algorithms
Logistic regression (LR) is an algorithm that applies a logistic function to a linear combination of the independent variables to produce a probability prediction, which is then used to classify the values. It is classically and widely used to identify risk factors in medical research [13]. Support vector machine (SVM) is an algorithm that converts input data into high-dimensional spatial data and then determines the optimal decision boundary that maximizes the distance between data classes [14]. Random forest (RF) is an ensemble model that builds on the decision tree model. It creates multiple decision trees and aggregates the results of each tree using an ensemble technique to make a final decision [15]. Extreme gradient boosting (XGBoost) is an algorithm that addresses the shortcomings of the gradient boosting algorithm and is known for its speed and superior prediction performance compared with other models. Internal cross-validation was performed at each iteration to prevent overfitting [16].
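As a small illustration of the LR description, the sketch below passes a linear combination of feature values through the logistic function to obtain a probability. The weights, bias, and feature encoding are hypothetical placeholders and are not coefficients from this study.

```python
import math

def predict_probability(features, weights, bias):
    """Logistic function applied to a linear combination of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for two binary features (illustration only).
weights = [0.8, 0.5]
bias = -1.0

p_low = predict_probability([0, 0], weights, bias)   # both features absent
p_high = predict_probability([1, 1], weights, bias)  # both features present
neutral = predict_probability([0, 0], weights, 0.0)  # z = 0 maps to 0.5
```

With positive weights, the predicted probability rises as more risk features are present, which is why the fitted coefficients can be read as indicators of risk factors.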

Feature selection
In this study, we employed a permutation importance technique for feature selection. Permutation importance is a method commonly used in machine learning to assess the significance of model features, and it has the advantage of being applicable to any type of model. This technique quantifies the increase in prediction error when the values of a feature are randomly permuted, which breaks the relationship between that feature and the actual outcome. By observing the increase in model error for each feature, we gained insight into how strongly the model depends on that feature [17]. We used permutation importance to select features from a pool of 16, ultimately identifying 8 key features: PNI, pTNM, postoperative chemotherapy, age, pN, CCRT, sex, and pT (Fig 2).
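The procedure described above can be sketched as follows. This is a minimal illustration, assuming accuracy as the scoring metric and a toy stand-in model; the paper does not state which metric or settings were used, and the names `accuracy` and `permutation_importance` are hypothetical.

```python
import random

def accuracy(model, rows, targets):
    """Fraction of rows for which the model predicts the target."""
    return sum(model(r) == t for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)  # break the link between feature and outcome
            permuted = [r[:col] + [v] + r[col + 1:]
                        for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(model, permuted, targets))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy data (hypothetical): feature 0 determines the outcome, feature 1 is noise.
rows = [[0, 1], [1, 0], [0, 0], [1, 1], [0, 1], [1, 0], [0, 0], [1, 1]]
targets = [0, 1, 0, 1, 0, 1, 0, 1]
model = lambda r: r[0]  # stand-in for a fitted classifier

importances = permutation_importance(model, rows, targets)
```

Shuffling the informative feature lowers accuracy, whereas shuffling a feature the model ignores leaves it unchanged, so its importance is zero.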

Optimal combination of hyperparameters
In this study, we used a grid search technique to tune the hyperparameters of each machine learning model. A grid search is an exhaustive technique that determines the optimal combination of hyperparameter values by evaluating every combination in a predefined grid [18]. We used a grid search to combine hyperparameter values for each model and cross-validated each combination using the training data to select the parameter combination exhibiting the best area under the curve (AUC) performance.
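A grid search of this kind can be sketched in a few lines. The parameter names (`C`, `gamma`), the grid values, and the scoring function below are hypothetical; in practice `score_fn` would return the mean cross-validated AUC on the training data, and the grids used in the study are not reported here.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every combination in the grid and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # e.g. mean cross-validated AUC (assumed)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid and a toy score that peaks at C = 1.0, gamma = 0.1.
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}
toy_score = lambda p: -(p["C"] - 1.0) ** 2 - (p["gamma"] - 0.1) ** 2

best_params, best_score = grid_search(param_grid, toy_score)
```

Because the number of combinations is the product of the grid sizes (here 3 × 3 = 9), the cost grows quickly as parameters are added.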

Model performance comparison
After feature selection based on permutation importance, four machine learning algorithms were trained with the selected features of the training dataset (n = 1334). For model performance comparison, the following indices were used: sensitivity, specificity, accuracy, and AUC.
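The four indices can be computed as in the sketch below. The confusion-matrix counts are hypothetical: the paper does not report them, and they were chosen only because they sum to the test-set size (n = 193) and reproduce the rates reported for SVM. The prediction scores in the AUC example are likewise invented.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recurrences correctly identified
    specificity = tn / (tn + fp)  # non-recurrences correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive case outranks a negative case."""
    wins = sum((p > n) + 0.5 * (p == n)  # ties count as half a win
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical counts (not reported in the paper); they total 193.
sens, spec, acc = classification_metrics(tp=18, fn=8, tn=136, fp=31)
# rounded to three decimals: 0.692, 0.814, 0.798

# Hypothetical predicted scores for three recurrences and three non-recurrences.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Unlike the first three indices, AUC does not depend on a single decision threshold, which is one reason it is used as the primary comparison metric across models.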
For machine learning, statistical analysis, and performance validation, we used Python software (version 3.

Baseline patient demographics
A total of 961 patients were included in the study. The median follow-up period was 60.8 months (range: 1.2–192.4). The recurrence rate during follow-up was 13.2% (n = 127). In the chi-square test, age, initial CEA level, pT, LVI, PNI, pN, pTNM, and postoperative chemotherapy were statistically significant (p < 0.05). The baseline patient demographics are shown in Table 1.

Feature importance depending on machine learning methods
Fig 5 shows the feature importance values for each machine learning model based on permutation importance. The variable with the highest importance was pT, as assessed by SVM, RF, and XGBoost (0.06, 0.12, and 0.13, respectively), whereas pTNM had the highest importance in LR (0.05). In the SVM, pT and sex had the highest values (0.06).

Discussion
In this study, we used four machine learning algorithms to analyze the factors associated with recurrence in a 15-year database of consecutive patients with rectal cancer who underwent curative surgery. Although SVM showed the best performance (AUC = 0.831), the other machine learning methods also had comparable AUC values of more than 0.8. The comparison of AUC performance among the machine learning models did not yield a statistically significant difference (p = 0.274). Thus, the focus is primarily on the AUC values. Among the models evaluated, the SVM demonstrated the highest AUC at 0.831, followed by the RF with an AUC of 0.826, LR at 0.811, and XGBoost at 0.804. Based on these results, the SVM can be considered the most effective model for predicting recurrence in this study. In SVM, RF, and XGBoost, pT was the top-ranked feature, whereas pTNM showed the highest feature importance in LR. Both features reflect the pathologic tumor stage. This strongly suggests that pathologic tumor stage is the most influential predictor of rectal cancer recurrence after curative resection. Tumor stage is a well-known and established prognostic factor for most malignant diseases [19]. Especially in locally advanced rectal cancer, oncologists try to decrease the tumor stage through CCRT because a tumor response with complete response or down-staging provides better oncologic outcomes [20]. In this regard, several studies have sought to enhance the efficacy of CCRT with additional preoperative methods [21,22]. Our findings confirm again that tumor stage is an important factor in the recurrence of rectal cancer. In all machine learning methods except LR, the features with the first- and second-highest importance were pT and sex. According to the AJCC 8th edition, T3 is defined as 'tumor invades through the muscularis propria into pericolorectal tissues,' and T4a is defined as 'tumor penetrates to the surface of the visceral peritoneum' [12]. Because the lower rectum has no visceral peritoneum, T3 tumors can involve the mesorectal fascia. Therefore, the T stage is a more influential factor in rectal cancer than in colon cancer, which may be reflected in our results. Male sex was another high-ranked risk factor in this study. Previous studies have reported that male sex is a significant predictor of recurrence in colorectal cancer [23][24][25]. According to Demb et al., male sex had a significantly higher odds ratio than female sex for colorectal cancer recurrence, and the odds ratio was higher for rectal cancer (OR = 2.84) than for distal colon cancer (OR = 1.84) [25]. This implies that clear surgical resection is more challenging in male patients with rectal cancer, because the pelvic cavity is narrower and deeper in men than in women.
Although CCRT and adjuvant chemotherapy did not rank highly in feature importance in our study, they conventionally play a crucial role in improving survival outcomes in rectal cancer. We therefore considered distinguishing the detailed regimens of perioperative therapy, because anticancer treatment has evolved over time. Ultimately, however, there was no need to classify treatments in detail in this study, because we included only patients with non-metastatic rectal cancer. Furthermore, every patient with cancer, except those enrolled in clinical trials, received chemotherapy or radiotherapy according to government guidelines, because cancer treatment in South Korea is fully covered by national health insurance. For these reasons, palliative chemotherapeutic agents such as targeted agents or multikinase inhibitors, and total neoadjuvant therapy, were not considered. Since the 1980s, 5-fluorouracil/leucovorin has been the main regimen of adjuvant chemotherapy for rectal cancer [25][26][27]. As an exception, the 5-fluorouracil/leucovorin with oxaliplatin (FOLFOX) regimen has been covered by national health insurance since August 2016, after the ADORE study suggested that FOLFOX yielded better oncologic outcomes than conventional 5-fluorouracil/leucovorin for patients with stage II or III rectal cancer who received CCRT [28]. In our study, only 4.1% of patients (n = 40/961) received adjuvant FOLFOX chemotherapy after August 2016, so its impact on the evaluation of predictive factors of recurrence is thought to be negligible.
All machine learning models performed reliably, with no statistically significant differences in performance (p = 0.274). The SVM demonstrated the highest AUC, whereas the RF may be a better choice when sensitivity and specificity are considered. RF achieved the second-best performance with an AUC of 0.826, and the difference between its sensitivity and specificity was smaller than that of SVM. SVM exhibited a relatively large discrepancy between sensitivity (0.692) and specificity (0.814), indicating the potential presence of bias in training compared with RF. However, owing to the limited size of the test data, it is not possible to definitively conclude that the SVM is more biased. SVM models excel in identifying complex decision boundaries within high-dimensional datasets containing numerous features. This capability stems from their efficient margin maximization between classes, which enhances the model's generalizability and resilience against outliers. However, SVM models face challenges with large datasets because of their time-intensive nature and the complexity involved in selecting optimal kernels and parameters. Conversely, they are highly efficient with smaller datasets. In contrast, tree-based models such as RF and XGBoost, while effective, may encounter overfitting issues in high-dimensional spaces. The distinct attributes of SVM models proved advantageous in our analysis of the colorectal cancer surgery database. This led to SVM models achieving the highest AUC score, suggesting their superiority in predicting rectal cancer recurrence within the context of this specific database.
This study had several limitations. First, this was a single-center retrospective study, and selection bias could not be excluded. Second, the analysis was performed using only a limited number of factors. Other clinically significant factors, such as smoking status, tumor regression grade after CCRT, mesorectal fascia involvement, or molecular biomarker status (RAS or microsatellite instability), were not included. We attempted to analyze as many factors as possible; however, many factors had more than 20% missing data. Factors with large proportions of missing data were excluded to improve the quality of the database. Consequently, no data were missing in our study. Third, there was an imbalance in the data ratios between the recurrence and no-recurrence groups. We employed the SMOTETomek technique to address this imbalance; however, it has limitations in fully resolving the underlying problem. The amount of data available for testing in the recurrence group was insufficient for adequate validation. Further research involving cross-validation is required to address these issues. Future studies should focus on collecting additional data from recurrence groups, and the generalization of the model should be addressed through the collection and validation of multicenter data. Finally, we did not distinguish between the p and yp stages (i.e., pathologic findings following preoperative systemic chemotherapy or radiation prior to surgery as a primary treatment) in the pathologic tumor stage. Because the tumor stage could decrease after CCRT, the p-stage could be underestimated in patients treated with CCRT. However, despite these limitations, we evaluated risk factors for recurrence, focusing on patients with rectal cancer who underwent curative resection, using various machine learning techniques without missing data. The strength of our study is that it improves the quality of analysis through multiple machine learning methods, compared with other studies, which usually relied on a single analysis method, LR.

Conclusions
In this study, we analyzed and compared the importance of risk factors for rectal cancer recurrence using four different machine learning methods. We found that various machine learning methods increased the predictive validity of rectal cancer recurrence. The SVM showed the best AUC value. The most influential factor was pT for all machine learning methods, except for LR. Clinicians should be more alert if patients have a high pT stage during postoperative follow-up.
Fig 1). All of the included patients underwent total mesorectal excision by an open or laparoscopic approach, depending on the surgeon's preference. Patients with clinical tumor stage 3-4 (cT3-4) and any clinical node stage (cNany), or with any clinical tumor stage (cTany) and a clinically positive node stage (cN1-2), received concurrent chemoradiotherapy (CCRT) 8 to 10 weeks before surgery. Adjuvant chemotherapy was determined by multidisciplinary discussion considering the final pathologic stage and the patient's clinical condition. There were 834 and 127 patients in the no-recurrence and recurrence groups, respectively. For model training, the overall database was divided into training and testing datasets. A randomly selected 20% of the data from each of the recurrence and no-recurrence groups was used as the test dataset (n = 193), and the remaining data were used as the training dataset (n = 768).
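The group-wise 20% split can be illustrated with the sketch below. It is a hedged reconstruction, not the authors' code: the function name `stratified_split`, the random seed, and in particular the choice to round each group's test size up are assumptions. Rounding up happens to reproduce the reported dataset sizes from the group sizes of 127 and 834.

```python
import math
import random

def stratified_split(indices_by_class, test_fraction=0.2, seed=0):
    """Draw the same fraction of test samples from each class separately."""
    rng = random.Random(seed)
    train, test = [], []
    for label, indices in indices_by_class.items():
        shuffled = list(indices)
        rng.shuffle(shuffled)
        # Rounding up is an assumption; the paper does not state the rule.
        n_test = math.ceil(len(shuffled) * test_fraction)
        test += [(i, label) for i in shuffled[:n_test]]
        train += [(i, label) for i in shuffled[n_test:]]
    return train, test

# Patient indices are placeholders; only the group sizes come from the text.
groups = {"recurrence": range(127), "no_recurrence": range(127, 961)}
train, test = stratified_split(groups)
n_train, n_test = len(train), len(test)  # 768 and 193
```

Splitting each group separately keeps the recurrence rate in the test set close to that of the full cohort, which matters when only 13.2% of patients had a recurrence.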

Fig 2. Eight high-ranked features for rectal cancer recurrence after curative surgery by the mean value of permutation importance in four machine learning methods. https://doi.org/10.1371/journal.pone.0290141.g002